Require compound terms for typed literal objects #151
Merged
Promote the typed-literal guidance from discretionary ("may, if clearer")
to directive: dates, amounts, ordinals, and plain numbers MUST be written
as compact compound terms (date()/ordinal()/amount()/number()) instead of
prose strings. Left to discretion the extractor never emits them, so the
engine cannot sort/threshold/range over values that are really comparable.
- mapping table prose -> compound term per type
- honest engine-support note: date/ordinal fully project; amount needs a
unit table and is positive-int only (use number() for negatives);
number projection still pending (#125) but emit the term for structure
- cross-reference attribute-relations.md / typed-relations.md so declared
relations actually project and compare
…ble)
#125 (number-type comparison) is CLOSED: `number` projects to a fixed-point int64 scaled ×1000 (literal_types.parse_number_scaled), so date/ordinal/number/amount all compare. The init template (factlog/cli.py) and this PR's earlier text-to-fact wording still said number was "not yet engine-projectable", which is wrong and was the source of incorrect guidance. Fixes both:
- factlog/cli.py typed-relations.md template: number now documented as fixed-point ×1000 int64, positive-only, thresholds in scaled units.
- text-to-fact.md: number is projectable; thresholds use scaled integers (`V >= 2000`, not `2.0`); number AND amount reject negatives (verified: parse_number_scaled('-672') -> None), so negative-capable values (e.g. an operating loss) cannot be made comparable and stay plain strings.
Verified: a number KB answers `version >= 2.0` (scaled `V >= 2000`) and an unscaled float threshold fails loud via _assert_no_unscaled_number_threshold. Scaffolded typed-relations.md still parses to {} with no warning; typed_literals (9) and vocab (18) tests pass.
Two prompt-hardening changes to `skills/factlog/references/text-to-fact.md`
(the authoritative extraction criteria), plus a related stale-doc fix in the
`factlog init` template. All convert a soft "may" into a "must, when X", or
correct an out-of-date capability note.
1. Exhaustive extraction (completeness principle)
Dense tables — rosters, financial/registry status, budget line items,
schedules, career/patent records — are the highest-density fact source, yet the
prior criteria only said "record relation candidates." In practice the extractor
skimmed prose and dropped repeated table rows: a real proposal with ~400
extractable facts yielded ~90 (≈20–25% coverage).
2. Typed-literal compound terms (not discretionary)
Date/amount/ordinal/number objects left as prose strings ("2017.03.08",
"126백만원", i.e. 126 million won) can't be sorted/thresholded by the engine.
Left to discretion, the extractor never emits compound terms (observed: 0
across a full sync).
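As a minimal sketch of why a prose string is not comparable, the snippet below sorts date-like strings lexicographically and then by an integer key. The `date_key` helper and its `YYYYMMDD` encoding are assumptions made for illustration; they are not factlog's actual projection.

```python
from datetime import date


def date_key(raw: str) -> int:
    """Project a 'YYYY.MM.DD' string to a sortable integer (hypothetical encoding)."""
    year, month, day = (int(part) for part in raw.split("."))
    date(year, month, day)  # raises ValueError for impossible dates
    return year * 10_000 + month * 100 + day


prose = ["2017.3.10", "2017.11.02", "2016.12.31"]

# Plain strings compare character by character, so "11" sorts before "3".
by_string = sorted(prose)

# An integer key restores chronological order.
by_value = sorted(prose, key=date_key)
```

Here `by_string` puts November before March, while `by_value` is chronological. Emitting a typed term at extraction time gives the engine something it can project to such a key.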
- `date()`/`ordinal()`/`amount()`/`number()` for typed literals, with a prose→term mapping table
- `date`/`ordinal`/`number`/`amount` all project to comparable int64. `number` is fixed-point scaled ×1000 (3 decimals), so a hand-authored threshold uses scaled integers (`V >= 2000`, not `2.0`; an unscaled float fails loud). `number` AND `amount` are positive-only, so negative-capable values (e.g. an operating loss) cannot be made comparable and stay plain strings. `amount` also needs a unit table.
- cross-reference to `attribute-relations.md` / `typed-relations.md`
3. Stale #125 note in the init template
#125 (number-type comparison) is closed/implemented, but
`factlog/cli.py`'s `typed-relations.md` template still said number was "not yet
engine-projectable", which seeded incorrect guidance into every new KB. Updated
to the fixed-point ×1000 reality. (An earlier revision of this PR's
text-to-fact wording repeated the same stale claim; also corrected here.)
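The fixed-point ×1000 scaling can be sketched as follows. `scale_number` is a hypothetical stand-in written for this description, not the real `literal_types.parse_number_scaled`. Its handling of zero, of values with more than three decimals, and of int64 overflow is assumed.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional

SCALE = 1000            # three decimal places, as described above
INT64_MAX = 2**63 - 1


def scale_number(text: str) -> Optional[int]:
    """Return the value scaled x1000 as an int, or None if it cannot be projected."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return None                      # not a number at all
    if not value.is_finite() or value < 0:
        return None                      # negatives are rejected, as for '-672'
    scaled = value * SCALE
    if scaled != scaled.to_integral_value():
        return None                      # assumption: reject more than 3 decimals
    result = int(scaled)
    return result if result <= INT64_MAX else None
```

Under these assumptions `scale_number("2.0")` gives `2000` and `scale_number("-672")` gives `None`, which is why a negative-capable value stays a plain string.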
Docs/criteria/template only; no engine code paths touched. The reference file
is read at extraction time, so changes are live without reinstall. Verified:
scaffolded `typed-relations.md` still parses to `{}` with no warning;
`test_typed_literals.sh` (9), `test_vocab.sh` (18) pass; a number KB answers a
scaled `version >= 2.0` comparison and rejects an unscaled float threshold.
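A rough sketch of the "fails loud" behavior for thresholds, assuming the threshold arrives as a literal string. `scaled_threshold` and `value_at_least` are hypothetical names; the engine's own check (`_assert_no_unscaled_number_threshold`) may work differently.

```python
import re

_SCALED_INT = re.compile(r"^[0-9]+$")


def scaled_threshold(literal: str, scale: int = 1000) -> int:
    """Accept an already-scaled integer threshold; raise on anything else."""
    text = literal.strip()
    if not _SCALED_INT.match(text):
        raise ValueError(
            f"threshold {text!r} is not a scaled integer; "
            f"write it in scaled units (x{scale}), e.g. 2000 for 2.0"
        )
    return int(text)


def value_at_least(scaled_value: int, threshold_literal: str) -> bool:
    """Compare a projected (scaled) value against a hand-authored threshold."""
    return scaled_value >= scaled_threshold(threshold_literal)
```

With this guard, `value_at_least(2500, "2000")` is true, while passing `"2.0"` raises instead of silently comparing against the wrong magnitude.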